PHN 17.536 



1 



1 14.07.1999 
Efficient coding of side information in a lossless encoder. 



The invention relates to an apparatus for lossless encoding of a digital 
information signal, for a lossless encoding methodi to an apparatus for decoding and to a 
record carrier. 

5 For "Super Audio CD" (SACD) the DSD signals are losslessly coded, using 

framing, prediction and entropy coding. Besides the efficiently encoded signals, a large 
number of parameters, i.e. the side-information, has to be stored on the SACD too. The 
smaller the storage capacity that is required for the side-information, the better the overall 
coding gain is. Therefore coding techniques are applied to the side-information too. 

10 A description of the lossless encoding of DSD signals is given in the publication 'Improved 
lossless coding of 1-bit audio signals', by F. Bruekers et al, preprint 4563(1-6) presented at the 
103 rd convention of the AES, September 26-29, 1997 in New York. 



The invention aims at providing methods that can be used e.g. in SACD to save 
15 on the number of bits that have to be used for storing the side-information. 
In the following description those methods will be presented. 



These and other aspects of the invention will be further explained hereafter in 

the figure description, in which 
20 figure la shows a circuit diagram of a lossless encoder and figure lb shows a 

circuit diagram of a corresponding decoder, using linear prediction and arithmetic coding, 
figure 2 shows subsequent frames of a multi channel information signal, 
figure 3 shows the segmentation of time equivalent frames of the multi channel 

information signal, and 

25 figure 4 shows the contents of a frame of the output signal of the encoding 

apparatus. 



The process of lossless encoding and decoding, for the example of 1-bit 
oversampled audio signals, will be explained briefly hereafter by means of figure 1, which 
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shows an embodiment of the encoder apparatus in figure la and shows an embodiment of the 
decoder apparatus in figure lb. 

The lossless coding in the apparatus of figure la is performed on isolated parts 
(frames) of the audio signal. A typical length of such a frame is 37632 bits. The two possible 
5 bit-values of the input signal F, T and l 0\ represent the sample values +1 and -1 respectively. 
Per frame, the set of coefficients for the prediction filter z °.A(z), denoted by 4, is determined 
in a filter coefficient generator unit 12, by e.g. the autocorrelation method. The sign of the 
filter output signal, Z, determines the value of the predicted bit F p , whereas the magnitude of 
the filter output signal, Z, is ah indication for the probability that the prediction is correct. 

10 Upon quantizing the filter output signal Z in a quantizer 10, a predicted input signal F p is 
obtained, which is ex-ored in a combining unit 2, resulting in a residual signal E. A correct 
prediction, or F = F p , is equivalent to E = 0 in the residual signal E. The content of the 
probability table, p(|.|), is designed per frame such that per possible value of Z, p 0 is the 
probability that E = 0. For small values of |Z| the probability for a correct prediction is close to 

15 0.5 and for large values of |Z| the probability for a correct prediction is close to 1.0. Clearly the 
probability for an incorrect prediction, F * F p or E=l, is pi =l-po. 

The probability tables for the frames (or segments, to be described later) are 
determined by the unit 13. Using this probability table, supplied by the unit 13 to the unit 8, 
the unit 8 generates a probability value Po in response to its input signal, which is the signal Z. 

20 The arithmetic encoder (AC Enc.) in the apparatus of figure la, denoted by 6, 

codes the sequence of bits of E such that the code (D) requires less bits. For this, the arithmetic 
coder uses the probability that bit n of signal E, E[n], has a particular value. The number of 
bits to code the bit E[n]=0 is: 

25 d n = - 2 log(p 0 ) + £(bits) (Eq.l) 

which is practically not more than 1 bit, since p 0 £ 1/2. The number of bits to code the bit 
E[n]=l is: 

30 d n = - 2 log(p, ) + e = - 2 log(l-po ) + £ (bits) (Eq. 2) 

which is not less than 1 bit. The e in both equations represents the non-optimal behavior of the 
arithmetic coder, but can be neglected in practice. 
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A correct prediction (E[n]=0) results in less than 1 bit and an incorrect 
prediction (E[n]=l) results in more than 1 bit in the code (D). The probability table is designed 
such that on the average for the complete frame, the number of bits for code D is minimal. 

. Besides code D, also the coefficients of the prediction filter 4, generated by the 
5 coefficient generator unit 12, and the content of the probability table, generated by the 

probability table determining unit 13, have to be transmitted from encoder to decoder. To that 
purpose, the encoder apparatus comprises a multiplexer unit 14, which receives the output 
signal of the coder 6, as well as side information from the generator units 12 and 13. This side 
information comprises the prediction filter coefficients and the probability table. The 
10 multiplexer unit 14 supplies the serial datastream of information to a transmission medium, 
such as a record carrier. 

* In the decoder apparatus of figure lb, exactly the inverse of the encoder process 
is implemented thus creating a lossless coding system. The demultiplexer unit 20 receives the 
serial datastream comprising the data D and the side information. It retrieves the data D 
15 therefrom and supplies the data D to an arithmetic decoder 22. The arithmetic decoder (AC 
Dec.) is provided with the identical probabilities as the arithmetic encoder was, to retrieve the 
correct values of signal E. Therefore the demultiplexer unit retrieves the same prediction filter 
coefficients and probability table as used the encoder from the serial datastream received and 
supplies the prediction filter coefficients to the prediction filter 24 and the probability table to 
20 the probability value generator unit 26. 

The circuit constructions shown in figure 1 are meant for encoding/decoding a 
single serial datastream of information. Encoding/decoding a multi channel information signal, 
such as a multi channel digital audio signal, requires the processing described above with 
reference to figure 1 to be carried out in time multiplex by the circuits of figure 1, or can be 
25 carried out in parallel by a plurality of such circuits. Another solution can be found in 

international patent application IB 99/00313, which corresponds to US ser. no. 09/268252 
(PHN 16.805). 

It should be noted here that in accordance with the invention, the encoding 
apparatus may be devoid of the quantizer Q and the combining unit 2. Reference is made to 
30 earlier patent publications discussing this. 

In SACD the 1-bit audio channels are chopped into frames of constant length and per frame 
the optimal strategy for coding will be used. Frames can be decoded independently from 
neighbouring frames. Therefore we can discuss the data structure within a single frame. 
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Figure 2 shows time equivalent frames B of two channel signals, such as the 
left and right hand signal component of a digital stereo audio signal, indicated by B(l,m- 

1), B(l,m), B(l,m+1), for the left hand signal component and by B(r,m-1), B(r,m), 

B(r,m+1), ..... for the right hand signal component. The frames can be segmented, as will be 
5 explained hereafter. If not segmented, the frames will be encoded in their entirety, with one set 
of filter coefficients and one probability table for the complete frame. If segmented, each 
segment in a frame can have its own set of filter coefficients and probability table. 
Furthermore, the segmentation in a frame for the filtercoefficients need not be the same as for 
the probability tables. As an example, figure 3 shows the two time equivalent frames B(l,m) 

10 and B(r,m) of the two channel signals being segmented. The frame B(l,m) has been segmented 
into three segments fs(l,l), fs(l,2) and fs(l,3) in order to carry out three different prediction 
filterings' in the frame. It should however be noted that the filterings in two segments, such as 
the segments fs(l,l) and fs(l,3) can be the same. The frame B(l,m) has further been segmented 
into two segments ps(l,l) and ps(l,2) in order to have two different probability tables for those 

15 segments. 

The frame B(r,m) has been segmented into three segments fs(r,l), fs(r,2) and 
fs(r,3) in order to carry out three different prediction filterings in the frame. It should however 
again be noted that the filterings in two segments, such as the segments fs(r,l) and fs(r,3) can 
be the same. The frame B(r,m) has further been segmented into four segments ps(r,l), ps(r,2), 

20 ps(r,3) and ps(r,4) in order to have four different probability tables for those segments. Again, 
it should be noted that some of the segments can have the same probability table. 

The decision to have the same probability table for different segments can be 
taken on beforehand by a user of the apparatus, after having carried out a signal analysis on 
the signals in the segments. Or the apparatus may be capable of carrying out this signal 

25 analysis and decide in response thereto. In some situations, a signal analysis carried out on two 
segments may result in probability tables that differ only slightly. In such situation, it can be 
decided to have one and the same probability table for both segments. This one probability 
table could be equal to one of the two probability tables established for the two segments, or 
could be an averaged version of both tables. An equivalent reasoning is valid for the sets of 

30 filter coefficients in the various segments. 

To summarize: in order to encode a small portion of audio in an audio channel signal, the 
coding algorithm in SACD requires both a prediction filter (the filter) and a probability table 
(the table). 
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For improving the coding gain it can be efficient to use different filters in different channels. 
But also within the same channel is can be beneficial to use different filters. That is why the 
concept of segmentation is introduced. A channel is partitioned into segments and in a 
segment a particular filter is used. Several segments, also from other channels, may use the 
5 same or a different filter. Besides storage of the filters that are used, also information about the 
segments (segmentation) and information about what filter is used in what segment (mapping) 
have to be stored. 

For the tables, the same idea is applicable, however the segmentation and 
mapping may be different from the segmentation and mapping for the filters. In case of equal 

10 segmentation for both filter and table this is indicated. The same idea is used for the mapping. 
If the segmentation for the filters is equal for all channels this is indicate too. The same idea is 
used For the mapping. 

First, a description will be given of the contents of a frame of a transmission 
signal comprising the encoded channel signals and the corresponding side information. Figure 

15 4 shows a schematic drawing of the frame. Apart from synchronization information (not 

shown), the frame comprises two words wi and w 2 , followed by segmentation information on 
the prediction filters. Next a word w 3 is present followed by segmentation information on the 
probability tables. Next follow two words w 4 and w 5 , followed by mapping information on the 
prediction filters. Next, follows a word w 6 , followed by mapping information on the 

20 probability tables. Next follow the filter coefficients and the probability tables, as supplied by 
the generator units 12 and 13, respectively. The frame ends with the data D, supplied by the 
arithmetic encoder 6. 

The word Wi is in this example one bit long and can have the value *0' or V, 
and indicates whether the segment information for the filter coefficients and the probability 

25 tables are the same ( 4 T), or not ('0'). The word w 4 is in this example one bit long and can 
have the value '0* or T, and indicates whether the mapping information for the filter 
coefficients and the probability tables are the same (T), or not ('0'). The word w 2 , again one 
bit long, can have the value '0' or T, and indicates whether the channel signals have the same 
segmentation information for the prediction filter coefficients (T), or not ('0'). The word w 3 

30 (one bit long) can have the value '0' or T, and indicates whether the channel signals have the 
same segmentation information for the probability tables (T), or not CO"). The word w 5 can 
have the value 4 0' or T, and indicates whether the channel signals have the same mapping 
information for the prediction filter coefficients (T), or not ('0'). The word w 6 can have the 
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value '0' or T, and indicates whether the channel signals have the same mapping information 
for the probability tables ( 4 1 ' ), or not ( '0' ). 

First, the representation of the total number of segments S in a frame will be 

described. 

5 To code a number, e.g. the total number of segments in a frame in a particular 

channel signal, a kind of run-length coding is applied. It is important that the code is short for 
small values of S . Since the number of segments in a channel S > 1 > 5 = 0 needs not to be 
coded. In SACD the following codes are used. 

10 Table 1: 



s 


code(5) 


1 


1 


2 


01 


3 


001 


4 


0001 






s 


0< $1 >1 



Remark: Here the "1" is used as delimiter. It is clear that in general the role of 
the "0" and "1" can be interchanged. The basic idea of the delimiter is that a certain sequence 
15 is violated; the sequence of "0V is violated by a *T\ An alternative is e.g. to "inverse" the 
next symbol and <4 no inversion" is used as a delimiter. In this way long constant sequences are 
avoided. An example of inverting sequences that start with an 44 1" is (not used in SACD): 



s 


code(S) 


1 


0 


2 


11 


3 


100 


4 


1011 


5 


10100 


6 


101011 
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Second, the representation of the segment sizes will be described. The length of 
a segment will be expressed in number of bytes of the channel signal. The B bytes in a frame 
of a channel signal are partitioned into 5 segments. For the first 5 - 1 segments the number of 
bytes of each segment has to be specified. For the 5 th segment the number of bytes is specified 
5 implicitly, it is the remaining number of bytes in the channel. The number of bytes in segment 
i, equals 

B i so the number of bytes in the last segment is: 

Since the number of bytes in the first 5-1 segments are multiples of R , the resolution R > 1 , 
10 we define: 

Bj^ bjR and consequently: B s _ x = B-^b { R 

The 5-1 values of b t are stored and jR is stored in a channel only if 5 > 1 and when it is not 
stored already for another channel. 

The number of bits required to store fc, depends on its possible values. 

15 0 < A, < b Lnax with e.g. b itim = Lll- £», 

so the required number of bits to store b { is: 
#bitsfe)=L 2 Iog(6 i , nm )J+l 

This has as advantage that the required number of bits for the segment length 
may decrease for segments at the end of the frame. If restrictions are imposed on e.g. minima) 
20 length of a segment the calculation of the number of bits may be adapted accordingly. The 
number of bits to store the resolution R \v. #bits(tf) 

Third, the representation of the segmentation information in the serial 
datastream will be described. Use will be made of the representations given above under table 
1. This will be illustrated by some examples. 
25 In order to distinguish between filters and probability tables, the subscripts / and ; are used. To 
distinguish between segments in different channels the double argument is used: (channel 
number, segment number). 

Next follows a first example. For a 2-channel case, we have different 
segmentations for filters and probability tables, and the segmentation is different for both 
30 channels. The following table shows the parameters in the stream. 
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Table 2. 



Value 


#bits 


comment 


(w,=)0 


1 


segmentation information for the filters is different 
from the segmentation for the probability tables 






filter segmentation 


(w 2 =)0 


1 


channels have own filter segmentation information 






filter segmentation in channel 0 


(yi =)o 


1 


first bit of code(S/0)) indicating that S/0)>2 




#Wts(J!/) 


resolution for filters 


b/Ofi) 


#bits(£/0,0)) 


first segment in channel 0 has length R/b/Ofi) bytes 


(y 2 =)i 


1 


last bit of code(5/0)) indicating that S/0)=2 






filter segmentation in channel 1 


(yi =) o 


1 


first bit of code(S/l)) indicating that S/l)>2 


bjilfi) 


#bits(fc/l,0)) 


first segment in channel 1 has length #/2>/l,0) bytes 


(y 2 =)0 


1 


second bit of code(S/l)) indicating that S/l)>3 


*/U) 


#bits(fc/l,l)) 


second segment in channel 1 has length R/bj(l,l) 
bytes 


(y 3 =) i 


1 


last bit of code(S/l)) indicating that S/l)=3 






probability table segmentation 


(w 3 =)0 


1 


channels have own table segmentation specification 






probability table segmentation in channel 0 


(yi =) i 


1 


last bit of code(S,(0)) indicating that S,(0)=1 






probability table segmentation in channel 1 


(yi =)0 


1 


first bit of code(Sr(l)) indicating that S,(l)>2 


R t 


#bits(/?,) 


resolution for tables 


Wi.0) 


#bits(Wl,0)) 


first segment in channel 1 has length R t fc f (l,0) bytes 


(ya =)0 


1 


second bit of code(S,(l)) indicating that S,(l)>3 


Wi.i) 


#bits(WU)) 


second segment in channel 1 has length R, 6,(1,1) 
bytes 


(y 3 =)i 


1 


last bit of code(S,(l)) indicating that S,(l)=3 
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In the above table 2, the first combination (yi,y 2 ) equal to (0,1) is the codeword 
code(S) in table 1 above, and indicates that in the channel signal numbered 0 the frame is 
divided into two segments for the purpose of prediction filtering. Further, the combination 
(y 1) y 2 ,y 3 ) equal to (0,0,1) is the codeword code(S) in table 1 above, and indicates that in the 
5 channel signal numbered 1 the frame is divided into three segments for the purpose of 

prediction filtering. Next, we find a combination (y,) equal to (1), which is the first codeword 
in table 1, indicating that the channel signal numbered 0, the frame is not divided for the 
probability table. Finally, we find a combination (yi,y2,y 3 ) equal to (0,0,1), which indicates 
that the frame of the second channel signal is divided into three segments, each with a 
10 corresponding probability table. 

Next, follows another example for a 5-channel case.- It is assumed that for this 
5*?hannel case, we have equal segmentation for filters and tables, and the segmentation is 
equal for all channels. 



Value 


#bits 


comment 


(W!=) 1 


1 


the segmentation information for the prediction 
filters and probability tables is the same 






filter segmentation 


(w 2 =) 1 


1 


channels have equal filter segmentation information 






filter segmentation in channel 0 


(yi =)0 


1 


first bit of code(5/0)) indicating that S/0)>2 


*/ 


#bits(/?/) 


resolution for filters 


*/0,0) 


#bits(fr/0,0)) 


first segment in channel 0 has length Rfb/0fi) bytes 


(y 2 =)0 


1 


second bit of code (5/0)) indicating that S/0)>3 


fr/0,1) 


#bits(fr/0,l)) 


second segment in channel 0 has length R f bj(0 } \) 
bytes 


(y 3 =)i 


1 


last bit of code(S/0)) indicating that S/0)=3 






filter segmentation in channel c 






fc/c,0)=fy(0,0) and ^/c,l)=^/0,l) for l<c<5 






probability table segmentation in channel c 






fc,(c,0)=&/0,0) and fc r (c,l)=fy(0,l) for 0<c<5 



15 
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Remark: The single bits of code(S) interleaved in de segmentation information 

can be interpreted as "another segment will be specified" in case of "0" or "no more segments 

will be specified" in case of "1". 

Next, mapping will be described. 

5 

For each of the segments, all segments of all channels are considered together, 
it has to be specified which filter or table is used. The segments are ordered; first the segments 
of channel 0 followed by the segments of channel 1 and so on. 

10 

The filter or table number for segment 5 , N(s) is defined as: 

jV(0) = 0 
0<N{s)<N^(s) 

15 with (s) , the maximum allowed number for a given segment, defined as: 

^aux ( J ) = 1 + max(N(/)) with 0 < i < s 
The required number of bits to store N(s) equals: 

20 

#to(yV(5)) = L 2 Iog(/V max (5))J+l 

The number of bits that is required to store a filter or table number according to 
this method depends on the set of numbers that already has been assigned. 

25 

If the tables use the same mapping as the filters, which is not always possible, 
this is indicated. Also when all channels use the same mapping this is indicated. 



With two examples the idea will be illustrated. 

30 Example 3 
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Assume that in total we have 7 segments (0 through 6), some segments use the 
same filter and some use a unique filter. Furthermore it is assumed that the tables use the same 
mapping specification as the filters. 



Channel 
number 


Segment 
number 


Filter 
number 


Possible filter 
numbers 


#bits 


0 


0 


0 




0 


r 


1 


0 


Oorl 


1 


i 




1 


Oorl 


1 


l 


3 


2 


0, 1 or 2 


2 


l 


4 


3 


0,1, 2 or 3 ' 


2 


2 


5 


3 


0, 1, 2, 3 or 4 


3 


3 


6 


1 


0, 1,2, 3 or 4 


3 




Total #bits 


12 



5 

Segment number 0 uses filter number 0 per definition, so no bits are needed for 
this specification. Segment number 1 may use an earlier assigned filter (0) or the next higher 
not yet assigned filter (1), so 1 bit is needed for this specification. Segment number 1 uses 
filter number 0 in this example. Segment number 2 may use an earlier assigned filter (0) or the 

10 next higher not yet assigned filter (1) , so 1 bit is needed for this specification. Segment 
number 2 uses filter number 1 in this example. 

Segment number 3 may use an earlier assigned filter (0 or 1) or the next higher 
not yet assigned filter (2) , so 2 bits are needed for this specification. Segment number 3 uses 
filter number 2 in this example. 

15 Segment number 4 may use an earlier assigned filter (0, 1 or 2) or the next 

higher not yet assigned filter (3) , so 2 bits are needed for this specification. Segment number 4 
uses filter number 3 in this example. Segment number 5 may use an earlier assigned filter (0, 
1 , 2 or 3) or the next higher not yet assigned filter (4) , so 3 bits are needed for this 
specification. Segment number 5 uses filter number 3 in this example. 

20 Segment number 6 may use an earlier assigned filter (0, 1, 2 or 3) or the next 

higher not yet assigned filter (4) , so 3 bits are needed for this specification. Segment number 6 
uses filter number 1 in this example. 

In total 12 bits are required to store the mapping. The total number of segments 
(7 segments in this example) is known at this point in the stream. 
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Value 


#bits 


comment 


(W 4 =)l 


1 


probability tables have same mapping information 
as prediction. filters 






prediction filter mapping 


(w 5 =)0 


1 


channels have own filter segmentation 
information 




0 


filter number for segment 0 is 0 per definition 




#bits(JV/l)) 


filter number for segment 1 


m) 


#bits(tf/2)) 


filter number for segment 2 


AT/3) 


#bits(tf/3)) 


filter number for segment 3 


.JV/4) 


#bits(AX4)) 


filter number for segment 4 




m>its(Nj(5)) 


filter number for segment 5 


A&(6) 


#bits(A/X6)) 


filter number for segment 6 






probability table mapping 






AKi)=tf/i) for 0<i<7 



Another example. Assume that in total we have 6 channels each with 1 segment and each 
segment uses the same prediction filter and the same probability table. 

5 



Value 


#bits 


comment 


(w 4 =)l 


1 


probability tables have same mapping information 
as prediction filters 






prediction filter mapping 


(w 5 =) 1 


1 


channels have own filter mapping specification 




0 


filter number for segment 0 is 0 per definition 






prediction filter mapping for segment i 






N/i)=0 for l<i<6 






probability table mapping for segment i 






M(i)=A^(i)forO<i<6 



In total 2 bits are required to store the complete mapping. 
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Remark: A reason to give the indication that a following specification is also 
used for other application (e.g. for tables the same segmentation is used as for the filters) is 
that this simplifies the decoder. 

Whilst the invention has been described with reference to preferred 
5 embodiments thereof, it is to be understood that these are not limitative examples. Thus, 

various modifications may become apparent to those skilled in the art without departing from 
the scope of the invention as defined by the claims. As an example, the invention could also 
have been incorporated in an embodiment in which time equivalent signal blocks are encoded, 
without making use of segmentation. In such embodiment, the serial datastream obtained, like 
10 the datastream of figure 4, will be devoid of the segment information described there for the 
filters and the probability tables, as well as some of the indicator words, such as the indicator 
words W|, w 2 and w 3 . Further, the invention lies in each and every novel feature and 
combination of features. 



